Targeted single-cell genomics reveals novel host adaptation strategies of the symbiotic bacteria Endozoicomonas in Acropora tenuis coral

Background Endozoicomonas bacteria symbiosis with various marine organisms is hypothesized as a potential indicator of health in corals. Although many amplicon analyses using 16S rRNA gene have suggested the diversity of Endozoicomonas species, genome analysis has been limited due to contamination of host-derived sequences and difficulties in culture and metagenomic analysis. Therefore, the evolutionary and functional potential of individual Endozoicomonas species symbiotic with the same coral species remains unresolved. Results In this study, we applied a novel single-cell genomics technique using droplet microfluidics to obtain single-cell amplified genomes (SAGs) for uncultured coral-associated Endozoicomonas spp. We obtained seven novel Endozoicomonas genomes and quantitative bacterial composition from Acropora tenuis corals at four sites in Japan. Our quantitative 16S rRNA gene and comparative genomic analysis revealed that these Endozoicomonas spp. belong to different lineages (Clade A and Clade B), with widely varying abundance among individual corals. Furthermore, each Endozoicomonas species possessed various eukaryotic-like genes in clade-specific genes. It was suggested that these eukaryotic-like genes might have a potential ability of different functions in each clade, such as infection of the host coral or suppression of host immune pathways. These Endozoicomonas species may have adopted different host adaptation strategies despite living symbiotically on the same coral. Conclusions This study suggests that coral-associated Endozoicomonas spp. on the same species of coral have different evolutional strategies and functional potentials in each species and emphasizes the need to analyze the genome of each uncultured strain in future coral-Endozoicomonas relationships studies. Video Abstract Supplementary Information The online version contains supplementary material available at 10.1186/s40168-022-01395-9.

globally recognized. In the functional analysis of coral symbionts, symbiotic dinoflagellates have been the main target, but recently, the functions of coral-associated bacteria have also been gradually clarified, including the degradation of dead symbiotic dinoflagellates [3], provision of nitrogen sources [4], and protection from pathogenic bacteria via antibiotic activities [5].
The genus Endozoicomonas has been found to be abundant in corals worldwide and is associated with an extensive range of marine organisms, including shellfish [6,7], sea slugs [8], sponges [9,10], and sea anemones [11,12]. Endozoicomonas has been the most widely studied coral-associated bacterium, and it has been shown that phylogenetically different Endozoicomonas species colocalize within a single individual [3,13]. In addition, Endozoicomonas has been reported to play several functional roles in coral survival, such as suppressing mitochondrial degradation, supplying sugars, and contributing to the coral sulfur cycle through dimethylsulphoniopropionate (DMSP) [14,15]. However, little is known about the mechanisms by which Endozoicomonas establishes symbiotic relationships with its host corals. Recently, Ding et al. investigated the host cell entry mechanism of Endozoicomonas montiporae in Montipora coral. They hypothesized that this bacterium used eukaryotic-like genes, such as ephrin ligands, to enter the host cell [14]. This adaptation strategy is consistent with the fact that Endozoicomonas is not an obligate symbiont [14,16]. Endozoicomonas diversity has been studied at the 16S rRNA gene level, but not yet at the genome level. Therefore, there are method-dependent gaps in our understanding of Endozoicomonas diversity. Furthermore, comparative genome analysis of Endozoicomonas spp. from the same host coral species can reveal the evolutionary and adaptive strategies of each Endozoicomonas. However, these understandings require genomic information on many Endozoicomonas from the same host coral species.
The lack of genomic information is due to the difficulty in culturing host-associated bacteria [16]. Thus, culture-independent approaches are needed for studying genome data to understand the relationship between Endozoicomonas and its host. Currently, shotgun metagenomics with binning is the mainstream method for microbial genome sequence analysis, but this method is greatly influenced by microbiome composition and diversity [17].In most cases, especially in genome analysis of closely related species with highly similar 16S rRNA gene sequences, it is impossible to identify individual strain-level draft genome sequences by metagenomic binning [18]. Furthermore, detailed genome sequence analysis of host-associated bacteria is complicated because contamination by host also significantly affects binning efficiency [16,[19][20][21]. As an alternative to metagenomics, single-cell genomics is very useful, as it can reveal the genome diversity of individual cells [22][23][24] from a mixed cell population. On the other hand, in the current single-cell genomics, single-cells are randomly sorted for subsequent analysis. Therefore, when host-derived sequences are contaminated as in this case, the efficiency of acquiring target microbe is greatly reduced. Deep sequencing was required to obtain multiple genome sequences to reveal genome diversity, which required wasteful cost and effort.
Here, we suggest a novel approach for targeted single-cell genomics by detecting the target gene using droplet microfluidics. In this study, we identified seven draft genomes of Endozoicomonas spp. from A. tenuis sampled at four different sites in Okinawa Prefecture, Japan. We then investigated the genomic differences between different clades of Endozoicomonas symbiotic with A. tenuis coral and examined the role of eukaryotic-like genes. Finally, we proposed a symbiotic strategy model for Endozoicomonas bacteria in A. tenuis corals, which revealed a diverse adaptation process.
The results of single-cell genome analysis indicate the existence of at least two strategies for microbehost interactions, indicating the possibility of microbial regulation of host metabolic systems. These findings may provide insights into the robustness of coral reefs.  1A). Official permission to collect coral samples was obtained from Okinawa Prefecture in accordance with Okinawa Prefecture Fishing Regulation. The level of coral bleaching was evaluated on a 3-point scale (0: no bleaching, 1: slight bleaching, 2: complete bleaching). Small pieces of branches (~3 cm) were collected from A. tenuis corals and kept in a sterile zipper bag (140×100×0.04 mm; SEISANNIPPONSHA Ltd., Japan) filled with seawater from each sampling site. For 16S rRNA gene amplicon sequencing, coral branches were immediately transferred into a 5-mL tube (Eppendorf Ltd., Germany) containing 4 mL of RNAlater (Thermo Fisher Scientific, Waltham, MA, USA) and then frozen with liquid nitrogen. The frozen samples were kept in a −80°C deep freezer until analysis. For single-cell genome sequencing, the branches were kept on ice until just before the subsequent process.

DNA extraction and library preparation
The frozen coral branch was thawed on ice. The coral branch was then picked with tweezers and washed with artificial seawater. Then, in a sterilized zipper bag, the coral tissue was blown off using a Waterpik (EW-DJ61-W, Panasonic Corp., Japan), and the suspensions were collected into a 50-mL tube (Nippon Genetics Co., Ltd., Tokyo, Japan). After centrifugation at 10,000×g, 4°C for 30 min, the supernatant was removed, and the pellet was resuspended in 500 μL of artificial seawater. The suspension was transferred into a 1.5-mL tube (SSIBio, Lodi, CA, USA) and centrifuged at 15,000×g, 4°C for 15 min. After the supernatant was removed, the pellet was stored at −80°C until DNA extraction. DNeasy Plant Mini Kit (69104; QIAGEN, Hilden, Germany) was used for DNA extraction according to the manufacturer's instruction, with a small modification. The thawed pellet was mixed with 400 μL of Buffer AP1 and transferred into 2-mL screw cap tubes with stabilized 0.1 mm zirconia beads (Yasui Kikai Corp., Osaka, Japan). The tubes were homogenized three times at 2500 rpm for 60 s at 60-s intervals. After centrifugation, 4 μL of RNase A was added, and the sample was incubated at 65°C for 10 min before DNA extraction was performed following the kit protocol.

Each color indicates
Endozoicomonas and is shown for amplicon sequence variants that are above 8% in any sample. C Heatmap plot based on 16S rRNA gene copy number denoting Endozoicomonas abundance. The copy number of 16S rRNA gene is normalized by the surface area of the coral. The uncultured Gammaproteobacteria clade (black) is displayed as the outer group. The star shows corals that did not bleach in August 2016. Bleaching was recorded in Ishikawabaru and Sesoko Minami, but not in Yomitan and Onna DADA2 (via q2-dada2) [26]. All ASVs were aligned using MAFFT (via q2-alignment) [27]. Phylogenetic tree was constructed based on masked aligned ASVs using FastTree2 (via q2-mask and q2-phylogeny) [28]. Taxonomic assignment was performed using the q2-feature-classifier [29] classify-sklearn naïve Bayes taxonomy classifier against the Silva 138 99% OTUs full-length sequences [30]. The downstream analysis was performed using the R bioconductor phyloseq [31] and ggtree [32] in R version 3.5.2.

Droplet digital PCR for quantification of 16S rRNA gene copy number
The PCR mixture contained 10 μL of 2× ddPCR Supermix for Probes -3′, respectively. The droplets were collected into 0.2-mL tube and then thermal cycled using a T100 ™ Thermal Cycler (Bio-Rad) with the following protocol: 95°C for 10 min, followed by 40 cycles of 94°C for 30 s and 60°C for 1 min, 95°C for 10 min, and maintenance at 4°C. Subsequently, the droplets were reintroduced into microfluidic devices for digital counting. We have developed a system for calculating the rate of fluorescencepositive droplets using a 488-nm laser (Lambda mini, Tokyo Instruments, Tokyo, Japan) ( Supplementary Fig.  S1). Fluorescent signals from the droplets were detected by a photosensor module (H11902-01; Hamamatsu Photonics KK, Hamamatsu, Japan) and recorded by an oscilloscope (PICOSCOPE 2204A; Pico Technology Ltd., St. Neots, Cambridgeshire, UK). Combined with the information of droplet diameter measured with ImageJ, 16S rRNA gene copy number per 1 ng of extracted DNA was calculated [33].

Measurement of surface area of coral skeletons by CT scan
The coral branches with surface tissue removed by water picking were used for calculating the surface area. 3D images of the coral branches were obtained using an X-Ray CT scanner (TDM1300-IS; Yamato Scientific, Tokyo, Japan). The surface area was calculated by the Mimics software (Materialise, Leuven, Belgium). The 16S rRNA gene copy number per 1 cm 2 of surface area was calculated.

Preparation of bacterial suspension for single-cell genome sequencing
A small piece of coral branch was collected and kept in a 25-mL tube (Iwaki, Tokyo, Japan) filled with seawater from each sampling site. The tube was kept on ice immediately. Each branch was transferred into 5 mL of 0.22-μL-filtered UV-treated seawater of each sampling site and crushed thoroughly by a scalpel (Feather Safety Razor Co. Ltd., Japan). After 5 min of standing on ice, the supernatant was collected into three 1.5-mL tubes. The tube was subsequently centrifuged at 300×g for 5 min, and the supernatant was transferred into a new 1.5-mL tube to remove symbiotic dinoflagellates and other larger particles. The collected supernatant was centrifuged at 8000×g for 5 min. The supernatant was removed, and 500 μL of 0.22-μL-filtered UV-treated seawater was added to resuspend the pellet. The centrifugation at 300×g was repeated until most of the symbiotic dinoflagellates were removed. The centrifugation at 8000×g and the washing steps were repeated three times. Next, the pellet was resuspended in 500 μL of 1x SYBR Green (Thermo Fisher Scientific, Hampton, NH) for 5 min for DNA staining. After centrifugation at 8000×g, the pellet was resuspended in 50 μL of 0.22-μL-filtered UV-treated seawater, and the cell concentration was calculated with a bacterial counter (SLGC, Tokyo, Japan) under a fluorescent microscope (CKX53; Olympus Optical Co Ltd., Tokyo, Japan).

Whole genome amplification and detection of Endozoicomonas in gel beads
Cells were encapsulated into 40-μm microfluidic droplets at a concentration of 0.3 cell/droplet using a microfabricated microfluidic device reported previously [22]. The theoretical number of encapsulated cells in a droplet was derived by the Poisson distribution (Formula 1). The percentage of doublets would be less than 3.7% of the total droplets when the cell concentration was 0.3 cell/droplet.
Next, gel-bead-based whole genome amplification (WGA) was conducted according to the previously reported WGA method (single-cell amplified genomes (SAGs) in the gel are referred to as SAG-gel) [21,34]. After WGA, the gel beads were stained with DAPI, and amplification of the DNA was confirmed with microscopic observation (CKX53; Olympus). Subsequently,   Table 2). All raw reads were trimmed to remove adapter sequences by fastp v0.20.0 with default parameters [36]. Trimmed reads were assembled using SPAdes v3.12.0 with "-k auto --sc --careful" parameters [37]. All SAGs were concatenated using ccSAG [38]. SAGs with an average similarity of > 99.9% in single-copy marker genes were co-assembled and used for subsequent analyses. Contigs less than 1000 bp in the SAG were removed. We removed host-and human-derived contaminations, which were annotated as eukaryote by BLAST against the NCBI nt database, from the SAG [39]. The quality of the SAGs was checked using CheckM v1.0.13 [40]. For the subsequent analyses, only high-quality SAG data were used (85% > completeness, < 5% contamination).

Eukaryotic-like protein analysis
All the Pfam IDs with a Z-score of 10,000 in EffectiveELD version 5.2 were extracted from the InterProScan results [57]. For extracting eukaryotic-like genes based on sequence homology, genes were selected by taxonomic annotation using eggNOG-mapper v.1.0.3 [48]. In addition, we performed a taxonomy annotation of eukaryotic-like genes using MMseqs2 with "--tax-lineage 1 --vote-mode 0" options, using the eukaryotic-only NCBI nr database [58]. We also checked that these genes are not annotated with other bacteria to confirm that Endozoicomonas acquired the genes from eukaryotes during host adaptation using the NCBI nr database without the Endozoicomonas genus. Major hits of annotation result were segmented annelid worms Capitella teleta. These genes were considered to be derived from contamination and removed from the results.

Diversity analysis of the Endozoicomonas genus in Acropora tenuis
A. tenuis were collected at four locations (Ishikawabaru, Sesoko Minami, Yomitan, and Onna) in Okinawa in May and August 2016 (Fig. 1A). These four locations are placed within a 15-km radius. Through 16S rRNA gene sequencing, 1418 ASVs were detected as coral-associated bacteria, and 209 ASVs were classified as Endozoicomonas spp. In Ishikawabaru, Yomitan, and Onna, Endozoicomonas spp. was dominant, representing 70.7-99.7% of the detected bacteria (Fig. 1B). However, in Sesoko Minami, the proportion of Endozoicomonas spp. was only 1.8-50.3%, and order Rickettsiales was dominant instead (Supplementary Figure 2A), including Rickettsiales bacteria strain SESOKO1, which was previously reported by our group [59]. The results of principal coordinate analysis (PCoA) based on the Bray-Curtis distance of Endozoicomonas composition showed spatial specificity for A. tenuis (Supplementary Figure 2B and C).
Phylogenetic analysis of the 16S rRNA gene revealed that the Endozoicomonas species can be divided into three major clades: Clade A1 (Red) and Clade A2 (Green) constituted the Onna and Yomitan samples, whereas Clade B (Blue) comprised the Ishikawabaru and Sesoko Minami samples (Fig. 1C, phylogenetic tree and heatmap). Absolute abundance of 16S rRNA gene was calculated by the 16S rRNA gene copy number measured by ddPCR and the surface area value obtained by CT scan. The average (±sd) number of Endozoicomonas was 37,952 ± 75,836 copies/mm 2 (Ishikawabaru), 2306.8 ± 4241.6 copies/mm 2 (Sesoko Minami), 30,180 ± 30,165 copies/mm 2 (Onna), and 37,287 ± 29,083 copies/mm 2 (Yomitan), respectively (Fig. 1C, bar plot). These results suggest that the cell numbers of Endozoicomonas spp. vary drastically from site to site. Especially at Sesoko Minami, where the copy number of 16S rRNA gene was exceedingly low, the absolute number of Endozoicomonas was significantly lower than that at other sites. In August 2016, large-scale coral bleaching was observed around Okinawa Prefecture; however, the bleaching level of coral colonies at Sesoko Minami was lower than that at Ishikawabaru [60]. In particular, the single coral colony, in which Endozoicomonas Clade A1 was dominant, did not show any bleaching in August 2016 ( Fig. 1C; star). In our previous study, the density per surface area of zooxanthellae in corals was greatly reduced during coral bleaching in August 2016 [60]. However, in the same coral, the absolute number of Endozoicomonas did not decrease significantly from May to August in 2016 (Fig. 1C).

Single-cell genome sequencing of Endozoicomonas bacteria
To selectively obtain genomic information of target bacteria from crushed coral tissue suspensions, we have developed a method for targeted single-cell genomics using droplet microfluidics. Bacterial suspensions prepared from coral tissues were encapsulated into agarose gel droplets and processed to gel-bead-based WGA to obtain SAGs [24] (Fig. 2A). Microscopic observation revealed that 24.4-26.3% of the gel beads exhibited DAPI fluorescence, which corresponded to the Poisson distribution value. Subsequently, PCR-based detection of target genes with FAM probes was conducted, which enabled the identification of gel beads containing target SAGs (Fig. 2B). After PCR amplification, there were droplets displaying both DAPI and FAM fluorescence (Fig. 2C). The percentage of DAPI fluorescence-positive droplets is 0.8-1.2%, which represents the rate of droplets encapsulating the target bacterium.
The results of 16S rRNA gene sequencing followed by second-round WGA showed that 64% (28 out of 44) of the samples corresponded to the Endozoicomonas sequence. Among the remaining 16 samples, 13 exhibited insufficient DNA yield (< 2 ng/μL) after the second-round WGA. The remaining three were confirmed to be derived from host mitochondria. When droplets that were positive only for DAPI were isolated, all the sequence corresponded to the sequence of host mitochondria (n = 30), suggesting that our targeted single-cell genomics has a low false-negative rate. In order to obtain 28 target bacterial SAGs by random sampling, 2333-3500 droplets need to be isolated according to the calculation, which means that our method showed 53.0-79.5 times higher screening efficiency than the random sampling.
As a result, seven draft SAGs ranging from 2.1 to 6.4 Mbp were acquired from four sampling sites, three of which were classified as high-quality and three of which were classified as medium-quality according to the criteria by the Genomic Standards Consortium (GSC) ( Table 1).

Comparative genome analysis of Endozoicomonas genus
We conducted a comparative genome analysis using our five draft SAGs, which include four from this study and one from our previous report [60], collected from A. tenuis, and 10 public Endozoicomonas isolates collected from various kinds of marine organisms. Phylogenetic trees constructed with single-copy marker genes showed that Endozoicomonas spp. represented single clades that corresponded to symbiotic host organisms. However, Endozoicomonas spp. collected from Acropora coral (Endozoicomonas sp. ISHI1, SESOKO1, ONNA1, ONNA2, and YOMI1) were divided into two clades (clade A1 and B) (Fig. 3A), which was also confirmed by the ANI and average AAI (Supplementary Figure 3).
A similar trend was observed in the ortholog analysis. Out of a total of 9414 orthogroups, the number of orthogroups commonly detected in all Endozoicomonas spp. was 1148. We confirmed that the bacterial secretion systems are evolutionarily conserved in all Endozoicomonas symbiotic relationships with marine organisms (Supplementary Table 3), which are important for Endozoicomonas to secrete proteins related to eukaryotic pathways in the host. The prediction of effector proteins showed that 12.5-20.4% (T3SS) and 2.97-4.22% (T4SS) of proteins were released by bacterial secretion systems (Supplementary Table 4). On the other hand, the remaining 8266 orthogroups and 8061 unassigned genes were clade-or strain-specific, suggesting that Endozoicomonas spp. possess a diverse orthogroup profile. It appears that the profile of the orthogroups depends on the host of each Endozoicomonas sp. However, clade A1 and clade B, which were symbiotic with the same host (A. tenuis), shared no unique orthogroups.
Enrichment analysis of accessory genes to the KEGG pathway revealed the functional characteristics and differences of each clade (clade A1 and clade B). Of the top ten enriched pathways (nine pathway in Clade B) of each clade, eight pathways (except bacterial chemotaxis and flagellar assembly) in clade A1 as well as seven pathways (except sphingolipid metabolism and terpenoid backbone biosynthesis) in clade B were assigned as eukaryotic pathways (Fig. 3B, C). Primarily, Fig. 2 Targeted single-cell genomics with droplet microfluidics. A Microbial cells were encapsulated into 40-μm microfluidic droplets at the single-cell level (0.3 cell/droplet) with ultra-low melting temperature agarose solutions. Collected droplets were solidified, treated with lysis solution, and subjected to whole genome amplification (WGA) using a previously described single-cell amplified genome (SAG)-gel platform. After washing, whole genome amplified gel beads were resuspended in the PCR mixture. B Gel beads suspended in PCR mixture were reintroduced into microfluidic devices, and water in oil (W/O) droplets encapsulating gel beads were generated. After thermal cycling, droplets with target sequence exhibited fluorescence derived from Taqman probes. The fluorescence-positive droplets were isolated into multi-well plates and then subjected to second-round WGA and library preparation. C Microscopic images of W/O droplets after PCR. Droplets with genome amplification by WGA exhibit blue fluorescence derived from DAPI. Droplets with target gene amplification by PCR exhibit green fluorescence derived from FAM. The droplet indicated by the arrowhead contains genome amplicons of the target bacterium  [60] in Endozoicomonas clade A1, immune-related pathways were enriched, such as the TRAF signaling pathways and NF-κB pathways (Fig. 3B), which are generally not conserved in bacteria. Conversely, in clade B, these immune-related pathways were not enriched, but proteasomes and proteolysis pathways including eukaryotic-like E3 ubiquitin ligase gene were enriched (Fig. 3C).

Different symbiotic strategies utilizing eukaryotic-like genes
We further conducted a comparative genome analysis focusing on eukaryotic-like domains and found that all Endozoicomonas spp. used in this study had at least one eukaryotic-like domain (Fig. 4A). The types of eukaryotic-like domains were different between clade A1 (YOMI1, ONNA1, and ONNA2) and clade B (ISHI1 and SESEKO1) Endozoicomonas; namely, the eukaryotic-like domain found in common between clade A1 and clade B was only RING-type zinc finger (RINC-type zinc-finger, Zinc-finger double domain, Zinc finger, and C3HC4 type were grouped), and the remaining six domains (SOCS box, Inhibitor of Apoptosis domain, ephrin ligand, Ets domain, MATH domain, and FANCLE C-terminal domain) were found only in either species (Fig. 4A, Supplementary Table 5). We have previously reported that Endozoicomonas sp. ISHI1 has coral-like ephrin ligand genes [60]. In addition, there were other eukaryotic-like genes (such as the SNARE domain, Vps4 C terminal oligomerization domain-containing genes). In contrast, Endozoicomonas sp. YOMI1, ONNA1, and ONNA2, which belong to clade A1, did not possess ephrin ligand, but had eukaryotic-like genes, including MATH/TRAF, SOCS box, an Ets-domain, and an inhibitor of apoptosis domain (Fig. 4A). Furthermore, we found 532 genes that did not contain eukaryotic-like domains but were highly homologous against eukaryote in A. tenuis-symbiotic Endozoicomonas (YOMI1: 125 genes, ONNA1: 134 genes, ONNA2: 47 genes, SESOKO1: 90 genes, ISHI1: 136 genes) (Supplementary Table 6). As genes closely related to host A. tenuis coral genes, TRAF/MATH domain-containing genes (match: 26.0-62.2%), low-density lipoprotein receptor gene (match: 30.9-67.6%), and transcription factor 15-like gene (match: 27.5-30.4%) were detected in Endozoicomonas clade A1. In contrast, the genes for ephrin ligand (match: 31.3-46.0%) and RING finger and CHY zinc finger domain-containing gene (match: 46.1-64.3%) were detected in clade B. Eukaryotic-like E3 ubiquitin-protein ligase was detected as a common gene in both clades (Supplementary Table 5). Some of these eukaryotic-like genes were inserted into the genome (e.g., survivin gene, ephrin gene) (Fig. 4B). Furthermore, these genes did not contain introns, despite their similarity to the intron-containing host genes (Supplementary Figure 4).

The absolute abundance of Endozoicomonas varies greatly among coral habitats
Previous studies have focused solely on bacterial composition and failed to establish a relationship between the amount of coral symbiotic bacteria and their host conditions. To overcome this limitation, this study quantified the absolute 16S rRNA gene copy number of bacteria and revealed the abundance of Endozoicomonas spp., which varied by more than 72-fold among sampling sites. Several researchers have considered that the abundance of Endozoicomonas is related to health and robustness of corals [65,66]. However, in August 2016, corals from Ishikawabaru, which had a higher abundance of Endozoicomonas than Sesoko Minami, showed far extensive bleaching, which was not consistent with their findings [67]. Our previous studies also showed that inflammation-related genes are upregulated in A. tenuis, which is occupied by clade B Endozoicomonas spp. In this study, we measured quantitative Endozoicomonas abundance, which may be involved in the host side response at the gene expression level. The individual adaptive capacity of Endozoicomonas may also have an effect on the host [68]. For a deeper understanding of Endozoicomonas function in Acropora coral, the difference of the clade or absolute amount of Endozoicomonas should be considered.

Targeted single-cell genomics enables selective accumulation of low-abundance bacterial genomes
In addition to the 16S rRNA gene-based phylogenetic analysis, we conducted single-cell whole genome sequencing of target bacteria and collected seven SAGs belonging to clade A1 (Endozoicomonas sp. YOMI1, ONNA1, and ONNA2) and clade B (Endozoicomonas sp. SESOKO1 and ISHI1). Although SAGs consisted of fragmented contigs, and the total length of some SAGs were shorter compared to the genome size of reference Endozoicomonas, the completeness was comparable to cultured strains reported previously [15], providing evidence that this method can expand genomic information of uncultured bacteria. It is noteworthy that our method enabled selective acquisition of draft SAGs even when the abundance ratio of target bacteria was approximately 1%, thereby overcoming the disadvantage of metagenomics. Moreover, even in the conventional single-cell genomics, the composition of sequenced bacterial taxa is highly dependent on the proportion of bacteria in the target environment because single cells are randomly sorted [69]. In contrast, in our proposed method, WGA was followed by detection of target gene sequence (Endozoicomonas 16S rRNA gene) in gel beads to achieve selective isolation and subsequent genome analysis. To the best of our knowledge, this is the first report of targeted single-cell genomics on environmental microbes. Our method showed > 50 times higher screening efficiency of target bacteria than the random sampling, thus drastically reducing the amount of reagents and labor wasted on non-target samples. Our targeted single-cell genomics can also be used to detect specific gene sequences obtained from metagenomic shotgun analysis or 16S rRNA gene sequencing by designing specific primers and probes. As a further application, specific single-cell genomics with unique characteristics will be possible by targeting the sequences of biosynthetic gene clusters and drug-resistant genes. As the potential of rare bacteria to exert important functions in microbial populations is gaining increasing attention [70], our method can be a technological innovation and breakthrough for the analysis of single-cell genomics. In the single-cell genomics of environmental bacteria, it is often important to process the samples immediately after collection to obtain highquality draft SAGs. Because the microfluidic system used in this study is portable, on-site single-cell isolation and genome amplification can be performed with standard biological laboratory equipment, which is critical for obtaining high-quality SAGs.

Diversity of host adaptation strategies in Endozoicomonas
Several 16S rRNA gene amplicon and shotgun metagenomic analyses have suggested that coral-associated Endozoicomonas spp. may improve host coral health [65,66]. However, owing to the lack of reference genomes, detailed analysis of clade-specific functions in Endozoicomonas in the same coral species has not been performed. Therefore, we used our SAG data and compared the gene functions detected in each clade of Endozoicomonas. Comparative genome analysis revealed that the orthogroups of Endozoicomonas were conserved in each clade. Notably, even when the different clades of Endozoicomonas coexisted with A. tenuis, the shared orthogroups were only the core gene between clade A1 (Endozoicomonas sp. YOMI1, ONNA1, and ONNA2) and clade B (Endozoicomonas sp. ISHI1 and SESOKO1). These results suggest that Endozoicomonas independently adapts to its host during the process of evolution.
A comprehensive search for eukaryotic-like genes in Endozoicomonas revealed that various eukaryoticlike genes were enriched in a lineage-specific manner. Symbiotic bacteria are known to harbor eukaryoticlike domains, such as ankyrin repeat and WD40 repeat, which are involved in protein-protein interactions and are thought to be necessary for interaction with the host [71,72]. For example, in plants and sponges, many symbiotic bacteria and pathogens intervene in host signaling pathways via eukaryotic-like genes that mimic host genes, thereby influencing host conditions [73,74]. There have been few reports on how symbiotic bacteria acquire these eukaryotic-like genes, but in our study, some of these genes (e.g., ephrin ligand and survivin) were found to be inserted with spliced host coral genes (Supplementary Figure 4). This result suggests the bacterial-eukaryotic horizontal gene transfer is observed between host coral and coralassociated Endozoicomonas spp. One of many possible hypotheses is that the insertion of these genes may be mediated by host mRNA. Horizontal gene transfer between eukaryotes and prokaryotes has been studied in recent years, and more detailed studies are needed in the future.
A recent study reported that the copy number of ankyrin repeats is higher in E. acroporae genomes than in other marine organism-associated Endozoicomonas genomes [10], implying that some eukaryotic-like genes play essential roles in the symbiosis of Endozoicomonas with Acropora coral. On the basis of this idea, we hypothesized the functions and host adaptation strategies of each Endozoicomonas clade based on the functions of enriched eukaryotic-like domains. Recently, it was hypothesized that ephrin ligands are required for E. montiporae to transition the symbiotic mechanisms [14]. In our previous report, we have also confirmed a gene expansion of ephrin ligand in Endozoicomonas Clade B (Endozoicomonas sp. ISHI1 and SESOKO1) with a higher degree than that in E. montiporae [60]. Interestingly, this is consistent with the fact that the number of ephrin-like receptor genes is expanded in host Acropora coral compared with that in Montipora coral [75]. These results reinforce the hypothesis that Endozoicomonas uses ephrin ligand gene as a part of its adaptation strategy to its host coral. The consistency of ephrin ligand and host ephrin receptor gene expansion suggests the co-evolution between Endozoicomonas and host coral. Furthermore, Endozoicomonas clade B encodes genes associated with intracellular invasion, such as SNARE domains, which contribute to the inhibition of degradation by phagosomes in Chlamydia and Legionella [76]. However, Endozoicomonas clade B does not encode immune-related genes. This result is consistent with our recent finding that Endozoicomonas sp. ISHI1 stimulates immune-related pathways in corals through infection (Fig. 5A) [60]. In addition, genes containing the MATH/TRAF, SOCS box, and inhibitor of apoptosis domain were present in clade A1, but not in clade B. These genes suppress apoptosis induced by the innate immune pathway, such as the NF-κB and JAK-STAT pathways [77]. Recently, several studies reported that the innate immune pathways are activated in host corals as stress responses to bleaching [78]. In particular, the NF-κB pathway is conserved in cnidarians and has attracted much attention in the analysis of bleaching [79][80][81]. Therefore, we speculate that a group of eukaryotic-like genes that are significantly enriched in Endozoicomonas clade A1 may affect host innate immune pathways and increase tolerance to environmental stress.
Homology-based analysis also revealed several eukaryotic-like genes with unknown functions in Endozoicomonas clade A1 and B. These genes are phylogenetically conserved and may have some function in host adaptation and symbiosis. One example is the low-density lipoprotein receptor (LDLR) gene conserved in clade A1. Recently, the function of LDLR during oogenesis in A. tenuis has been reported; it has been shown that the transcript of AtLDLR is present in the mesentery, mesenteric membrane, and mesenteric filaments surrounding an oocyte, but not in the membrane of a developing oocyte. AtLDLR can bind to major yolk lipoproteins and may be transported into the cytoplasm of an oocyte as it develops [82].
This indicates that the LDLR-like gene of Endozoicomonas clade A1 has the binding function to host egg lipoprotein. Therefore, we speculate that the LDLR-like gene, which is involved in egg development, may contribute to vertical transfer of Endozoicomonas, as other infection-related genes such as ephrin ligand gene and SNARE gene have not been identified in clade A1. The speculation can be verified by future experiments.

Conclusions
We succeeded in the targeted single-cell genomics of Endozoicomonas from A. tenuis corals with droplet microfluidics. Thus, we accomplished obtaining highquality Endozoicomonas SAGs (clade A1 and B) from the same coral species, which were proved to be phylogenetically and functionally diverse. Comparative genome analysis revealed the difference in the eukaryotic-like gene profiles, suggesting that the symbiotic strategy of Endozoicomonas in A. tenuis is divided into two types. Our findings suggest that even in the same bacterial genus, each bacterial lineage has different gene profiles and functions. Therefore, it is necessary to conduct strain-level genome sequencing to analyze the functions and strategies for host adaptation. Although in vitro studies in the field environment are required to prove this hypothesis, our targeted single-cell genomics could be a powerful tool for acquiring SAGs of uncultured bacterium to infer its function. Analysis of various coral-associated Endozoicomonas SAGs will clarify their host adaptation strategies in more detail and shed light on some aspects of the symbiotic mechanisms.

Supplementary Information
The online version contains supplementary material available at https:// doi. org/ 10. 1186/ s40168-022-01395-9.  Table S1. List of Primer sets used to identify Endozoicomonas. Table S2. Raw read statistics of our acquired Endozoicomonas genus genome. Table S3. List of bacterial secretion systems conserved in Endozoicomonas. Table S4. Percentage of proteins released by the bacterial secretion system. Table S5. List of eukaryotic-like domains in Endozoicomonas. Table S6. List of all eukaryotic-like genes in Endozoicomonas.

Fig. 5
Proposed symbiotic strategy of Endozoicomonas acroporae in A. tenuis, as estimated by comparative genome analysis. A Relationship between clade B Endozoicomonas and A. tenuis corals. Endozoicomonas possesses many genes related to intracellular invasion, such as SNARE domain-containing genes, in addition to the eukaryotic-like gene ephrin ligand, but it does not possess any genes related to apoptosis. B Relationships between Endozoicomonas clade A1 and host A. tenuis corals. Endozoicomonas clade A1 possesses several eukaryotic-like genes involved in the suppression of apoptosis but does not have the coral-like ephrin ligand